arXiv:1505.00218v1 [cs.CV] 1 May 2015


Y. Boykov, H. Isack, C. Olsson, I.B. Ayed, arXiv:xxxx, May 2015 




Volumetric Bias in Segmentation and Reconstruction: Secrets and Solutions 

Yuri Boykov, Computer Science, UWO, Canada, yuri@csd.uwo.ca
Hossam Isack, Computer Science, UWO, Canada, habdelka@csd.uwo.ca
Carl Olsson, Centre for Math. Sciences, Lund University, Sweden, calle@maths.lth.se
Ismail Ben Ayed, Medical Biophysics, UWO, Canada, ibenayed@uwo.ca


Abstract 

Many standard optimization methods for segmentation and reconstruction
compute ML model estimates for appearance or geometry of segments, e.g.
Zhu-Yuille [21], Torr [18], Chan-Vese [6], GrabCut [16], Delong et al. [8].
We observe that the standard likelihood term in these formulations
corresponds to a generalized probabilistic K-means energy. In learning it
is well known that this energy has a strong bias to clusters of equal size,
which can be expressed as a penalty for KL divergence from a uniform
distribution of cardinalities [10]. However, this volumetric bias has been
mostly ignored in computer vision. We demonstrate significant artifacts in
standard segmentation and reconstruction methods due to this bias.
Moreover, we propose binary and multi-label optimization techniques that
either (a) remove this bias or (b) replace it by a KL divergence term for
any given target volume distribution. Our general ideas apply to many
continuous or discrete energy formulations in segmentation, stereo, and
other reconstruction problems.

1. Introduction 

Most problems in computer vision are ill-posed and optimization of
regularization functionals is critical for the area. In the last decades
the community developed many practical energy functionals and efficient
methods for optimizing them. This paper analyses a widely used general
class of segmentation energies motivated by Bayesian analysis, discrete
graphical models (e.g. MRF/CRF), information theory (e.g. MDL), or
continuous geometric formulations. Typical examples in this class of
energies include a log-likelihood term for models $P^k$ assigned to image
segments $S^k$

$$E(S,P) = -\sum_{k=1}^{K} \sum_{p \in S^k} \log P^k(I_p), \tag{1}$$

where, for simplicity, we focus on a discrete formulation with data $I_p$
for a finite set of pixels/features $p \in \Omega$ and segments
$S^k = \{p \in \Omega \,|\, S_p = k\}$ defined by variables/labels
$S_p \in \mathbb{N}$ indicating the segment index assigned to $p$. In
different vision problems models $P^k$ could represent Gaussian intensity
models [6], color histograms [2], GMM [21, 16], or geometric models [18, 8]
like lines, planes, homographies, or fundamental matrices.

[Figure 1 panels. Left column: "Secrets (1)". Right column: "Solutions (6),
(9-10)". (a) GrabCut [16] with unbiased data term (10). (c) Chan-Vese [6] +
[7] with target volumes (6).]

Figure 1. Left: segmentation and stereo reconstruction with standard
likelihoods or probabilistic K-means energy $E(S,P)$ in (1) has bias to
equal size segments (2). Right: (a-b) corrections due to unbiased data term
$\hat{E}(S,P)$ in (9,10) or (c) weighted likelihoods $E_W(S,P)$ in (6)
biased to proper target volumes, see (5). Sections 3.1-3.3 explain these
examples in detail.

Depending on application, the energies combine likelihoods (1), a.k.a. data
term, with different regularization potentials for segments $S^k$. One of
the most standard regularizers is the Potts potential, as in the following
energy

$$E_{Potts}(S,P) = -\sum_{k=1}^{K} \sum_{p \in S^k} \log P^k(I_p) + \lambda \cdot \|\partial S\|,$$

where $\|\partial S\|$ is the number of label discontinuities between
neighboring points $p$ on a given neighborhood graph or the length of the
segmentation boundary in the image grid [3]. Another common regularizer is
sparsity or label cost for each model $P^k$ with non-zero support
[18, 21, 1, 8], e.g.

$$E_{sp}(S,P) = -\sum_{k=1}^{K} \sum_{p \in S^k} \log P^k(I_p) + \gamma \cdot \sum_{k} [S^k \neq \emptyset].$$

In general, energies often combine likelihoods (1) with multiple different
regularizers at the same time.

This paper demonstrates practically significant bias to equal size segments
in standard energies when models $P = \{P^k\}$ are treated as variables
jointly estimated with segmentation $S = \{S^k\}$. This problem comes from
likelihood term (1), which we interpret as probabilistic K-means energy
carefully analyzed in [10] from an information theoretic point of view. In
particular, [10] decomposes energy (1) as¹

$$E(S,P) \overset{c}{=} \sum_{k=1}^{K} |S^k| \cdot KL(I^k|P^k) + |\Omega| \cdot \big(H(S|I) - H(S)\big)$$


where $KL(I^k|P^k)$ is KL divergence for model $P^k$ and the true
distribution² of data $I^k = \{I_p \,|\, p \in S^k\}$ in segment $k$.
Conditional entropy $H(S|I)$ penalizes "non-deterministic" segmentation if
variables $S_p$ are not completely determined by intensities $I_p$. The
last term is negative entropy of segmentation variables $-H(S)$, which can
be seen as KL divergence

$$-H(S) \overset{c}{=} KL(S|U) := \sum_{k=1}^{K} \frac{|S^k|}{|\Omega|} \ln \frac{|S^k|/|\Omega|}{1/K} \tag{2}$$

between the volume distribution for segmentation $S$

$$V_S := \left\{ \frac{|S^1|}{|\Omega|}, \frac{|S^2|}{|\Omega|}, \ldots, \frac{|S^K|}{|\Omega|} \right\} \tag{3}$$

and a uniform distribution $U = \{\frac{1}{K}, \ldots, \frac{1}{K}\}$.
Thus, this term represents volumetric bias to equal size segments $S^k$.
Its minimum is achieved for cardinalities $|S^k| = |\Omega|/K$.
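As a small numerical illustration of (2) and (3), the sketch below computes the volume distribution from segment cardinalities and checks that the KL divergence to the uniform distribution equals $\ln K - H(S)$, i.e. negative entropy up to a constant. The function names and the toy cardinalities are assumptions made for illustration only.

```python
import math

def volume_distribution(cardinalities):
    """V_S in (3): segment sizes normalized by the total number of points."""
    total = sum(cardinalities)
    return [c / total for c in cardinalities]

def entropy(dist):
    """H(S) = -sum_k v_k ln v_k, with 0 ln 0 treated as 0."""
    return -sum(v * math.log(v) for v in dist if v > 0)

def kl_to_uniform(dist):
    """KL(S|U) in (2) for the uniform distribution U = {1/K, ..., 1/K}."""
    K = len(dist)
    return sum(v * math.log(v * K) for v in dist if v > 0)

balanced = volume_distribution([500, 500])   # toy cardinalities
skewed = volume_distribution([50, 950])

for name, v in (("balanced", balanced), ("skewed", skewed)):
    # The two printed numbers agree: KL(S|U) = ln K - H(S).
    print(name, kl_to_uniform(v), math.log(len(v)) - entropy(v))
```

The balanced split has zero divergence, while the skewed split is penalized, which is the volumetric bias described above.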


¹ Symbol $\overset{c}{=}$ represents equality up to an additive constant.

² The decomposition above applies to either discrete or continuous
probability models (e.g. histogram vs. Gaussian). The continuous case
relies on Monte-Carlo estimation of the integrals over "true" data density.


1.1. Contributions 

Our experiments demonstrate that volumetric bias in probabilistic K-means
energy (1) leads to practically significant artifacts for problems in
computer vision, where this term is widely used for model fitting in
combination with different regularizers, e.g. [21, 18, 6, 16, 8]. Section 2
proposes several ways to address this bias.

First, we show how to remove the volumetric bias. This could be achieved by
adding extra term $|\Omega| \cdot H(S)$ to any energy with likelihoods (1),
exactly compensating for the bias. We discuss several efficient
optimization techniques applicable to this high-order energy term in
continuous and/or discrete formulations: iterative bound optimization,
exact optimization for binary discrete problems, and approximate
optimization for multi-label problems using $\alpha$-expansion [5]. It is
not too surprising that there are efficient solvers for the proposed
correction term since $H(S)$ is a concave cardinality function, which is
known to be submodular for binary problems [14]. Such terms have been
addressed previously, in a different context, in the vision literature
[11, 17].

Second, we show that the volumetric bias to uniform distribution could be
replaced by a bias to any given target distribution of cardinalities

$$W = \{w^1, w^2, \ldots, w^K\}. \tag{4}$$

In particular, introducing weights $w^k$ for log-likelihoods in (1)
replaces bias $KL(S|U)$ as in (2) by divergence between segment volumes and
desired target distribution $W$

$$KL(S|W) = \sum_{k=1}^{K} \frac{|S^k|}{|\Omega|} \ln \frac{|S^k|/|\Omega|}{w^k}. \tag{5}$$

Our experiments in supervised or unsupervised segmentation and in stereo
reconstruction demonstrate that both approaches to managing volumetric bias
in (1) can significantly improve the robustness of many energy-based
methods for computer vision.

2. Log-likelihood energy formulations 

This section has two goals. First, we present weighted likelihood energy
$E_W(S,P)$ in (6) and show in (8) that its volumetric bias is defined by
$KL(S|W)$. Standard data term $E(S,P)$ in (1) is a special case with
$W = U$. Then, we present another modification of likelihood energy,
$\hat{E}(S,P)$ in (9), and prove that it does not have volumetric bias.
Note that [10] also discussed unbiased energy $\hat{E}$. The analysis of
$\hat{E}$ below is needed for completeness and to devise optimization for
problems in vision where likelihoods are only a part of the objective
function.

Weighted likelihoods: Consider energy

$$E_W(S,P) := -\sum_{k=1}^{K} \sum_{p \in S^k} \log\big(w^k \cdot P^k(I_p)\big), \tag{6}$$

which could be motivated by a Bayesian interpretation [! ] where weights
$W$ explicitly come from a volumetric prior. It is easy to see that

$$E_W(S,P) = E(S,P) - \sum_{k=1}^{K} |S^k| \cdot \log w^k \tag{7}$$
$$\phantom{E_W(S,P)} = E(S,P) + |\Omega| \cdot H(S|W)$$

where $H(S|W)$ is a cross entropy between distributions $V_S$ and $W$. As
discussed in the introduction, the analysis of probabilistic K-means energy
$E(S,P)$ in [10] implies that

$$E_W(S,P) \overset{c}{=} \sum_{k=1}^{K} |S^k| \cdot KL(I^k|P^k) + |\Omega| \cdot H(S|I)$$
$$\phantom{E_W(S,P) \overset{c}{=}} - |\Omega| \cdot H(S) + |\Omega| \cdot H(S|W).$$



Figure 2. (Entropy - bound optimization) According to (7,10) energy
$E_{W_t}(S,P)$ is a bound for $\hat{E}(S,P)$ since cross entropy
$H(S|W_t)$ is a bound for entropy $H(S)$ with equality at $S = S_t$. This
standard fact is easy to check: function $-z \log z$ (blue curve) is
concave and its 1st-order approximation at $z_t = w_t^k = |S_t^k|/|\Omega|$
(red line) is a tight upper bound or surrogate function [13].


Combining two terms in the second line gives

$$E_W(S,P) \overset{c}{=} \sum_{k=1}^{K} |S^k| \cdot KL(I^k|P^k) + |\Omega| \cdot H(S|I) + |\Omega| \cdot KL(S|W). \tag{8}$$

In case of given weights $W$, equation (8) implies that weighted likelihood
term (6) has bias to the target volume distribution represented by KL
divergence (5).

Note that optimization of weighted likelihood term (6) presents no extra
difficulty for regularization methods in vision. Fixed weights $W$
contribute unary potentials for segmentation variables $S_p$, see (7),
which are trivial for standard discrete or continuous optimization methods.
Nevertheless, examples in Sec. 3 show that indirect minimization of KL
divergence (5) substantially improves the results in applications if
(approximate) target volumes $W$ are known.
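To make the "unary potentials" remark concrete, here is a minimal sketch that labels 1D intensities by the per-pixel cost $-\log(w^k \cdot P^k(I_p))$ from (6). It assumes two fixed Gaussian models and no regularizer, so each pixel is labeled independently; the helper names, model parameters, and data are hypothetical.

```python
import math

def neg_log_gauss(x, mu, sigma):
    """-log of a 1D Gaussian density N(mu, sigma^2) evaluated at x."""
    return 0.5 * ((x - mu) / sigma) ** 2 + math.log(sigma * math.sqrt(2 * math.pi))

def unary_costs(x, models, weights):
    """Cost -log(w^k * P^k(x)) of each label k for one intensity x, as in (6)."""
    return [neg_log_gauss(x, mu, s) - math.log(w)
            for (mu, s), w in zip(models, weights)]

def label_pixels(pixels, models, weights):
    """Independent per-pixel labeling (no smoothness term in this sketch)."""
    labels = []
    for x in pixels:
        costs = unary_costs(x, models, weights)
        labels.append(costs.index(min(costs)))
    return labels

pixels = [0.0, 0.2, 0.45, 0.55, 0.8, 1.0]      # toy intensities
models = [(0.0, 0.3), (1.0, 0.3)]              # assumed (mean, std) per segment
uniform = label_pixels(pixels, models, [0.5, 0.5])
skewed = label_pixels(pixels, models, [0.05, 0.95])
print(uniform, skewed)
```

With equal weights the decision threshold sits halfway between the two means; a small target weight for the first segment shifts the threshold so that fewer pixels are assigned to it.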

Unbiased data term: If weights $W$ are treated as unknown parameters in
likelihood energy (6) they can be optimized out. In this case decomposition
(8) implies that the corresponding energy has no volumetric bias:

$$\hat{E}(S,P) := \min_{W} E_W(S,P) \tag{9}$$
$$\phantom{\hat{E}(S,P)} \overset{c}{=} \sum_{k=1}^{K} |S^k| \cdot KL(I^k|P^k) + |\Omega| \cdot H(S|I).$$

Weights $V_S$ in (3) are the ML estimate of $W$ that minimizes (8) by
achieving $KL(S|W) = 0$. Putting optimal weights $W = V_S$ into (7)
confirms that volumetrically unbiased data term (9) is a combination of
standard likelihoods (1) with a high-order correction term $H(S)$:

$$\hat{E}(S,P) = E(S,P) - \sum_{k=1}^{K} |S^k| \cdot \log \frac{|S^k|}{|\Omega|}$$
$$\phantom{\hat{E}(S,P)} = E(S,P) + |\Omega| \cdot H(S). \tag{10}$$
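The step from (7) to (10) rests on the fact that cross entropy $H(S|W)$ is never smaller than entropy $H(S)$, with equality exactly when $W = V_S$. The following sketch checks this numerically for a few toy volume distributions; the fixed weights and test distributions are arbitrary choices for illustration.

```python
import math

def entropy(v):
    """H(S) = -sum_k v_k ln v_k."""
    return -sum(x * math.log(x) for x in v if x > 0)

def cross_entropy(v, w):
    """H(S|W) = -sum_k v_k ln w_k; linear in v for fixed weights w."""
    return -sum(x * math.log(y) for x, y in zip(v, w) if x > 0)

w_t = [0.3, 0.7]  # weights fixed at some current solution S_t (toy values)
for v in ([0.3, 0.7], [0.5, 0.5], [0.1, 0.9]):
    gap = cross_entropy(v, w_t) - entropy(v)   # equals KL(S|W_t) >= 0
    print(v, gap)
```

The gap is zero only for the distribution that matches the weights, which is why minimizing over $W$ in (9) turns the cross entropy term into $H(S)$.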



Figure 3. (Entropy - high order optimization) (a) polygonal approximation
for $-z \log z$. (b) "triangle" functions decomposition.
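The decomposition in Figure 3 can be sketched as follows. The code builds a polygonal approximation of $-z \log z$ on $[0, 1]$ and writes it as a sum of "triangle" functions, each the minimum of two affine functions of $z$. The breakpoints, the coefficient names `aL`, `aU`, `bU`, and the normalization to $z = |S^k|/|\Omega|$ are assumptions of this sketch, not the exact parametrization used in the experiments.

```python
import math

def neg_z_log_z(z):
    """The concave function -z log z, extended by 0 at z = 0."""
    return 0.0 if z <= 0.0 else -z * math.log(z)

def triangle_terms(breakpoints):
    """Return one (aL, aU, bU) triple per interior breakpoint such that
    sum_l min(aL*z, aU*z + bU) interpolates -z log z at every breakpoint.
    Assumes sorted breakpoints that start at 0.0 and end at 1.0."""
    values = [neg_z_log_z(z) for z in breakpoints]
    slopes = [(values[i + 1] - values[i]) / (breakpoints[i + 1] - breakpoints[i])
              for i in range(len(breakpoints) - 1)]
    terms = []
    for i in range(1, len(breakpoints) - 1):
        z_i = breakpoints[i]
        drop = slopes[i - 1] - slopes[i]  # slope decrease at z_i, positive by concavity
        # Triangle with zeros at z = 0 and z = 1 and its peak at z_i.
        terms.append((drop * (1.0 - z_i), -drop * z_i, drop * z_i))
    return terms

def polygonal(z, terms):
    """Evaluate the sum of triangle functions at z."""
    return sum(min(aL * z, aU * z + bU) for aL, aU, bU in terms)

breakpoints = [0.0, 0.1, 0.25, 0.5, 0.75, 1.0]
terms = triangle_terms(breakpoints)
for z in breakpoints:
    print(z, polygonal(z, terms), neg_z_log_z(z))
```

Because each triangle is a minimum of two affine functions, it can be handled with one auxiliary binary variable, which is the property used by the high-order optimization discussed in this section.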


Note that unbiased data term $\hat{E}(S,P)$ should be used with caution in
applications where allowed models $P^k$ are highly descriptive. In
particular, this applies to Zhu-Yuille [21] and GrabCut [16] where
probability models are histograms or GMM. According to (9), optimization of
model $P^k$ will over-fit to data, i.e. $KL(I^k|P^k)$ will be reduced to
zero for arbitrary $I^k = \{I_p \,|\, p \in S^k\}$. Thus, highly
descriptive models reduce $\hat{E}(S,P)$ to conditional entropy $H(S|I)$,
which only encourages consistent labeling for points of the same color.
While this could be useful in segmentation, see bin consistency in [17],
trivial solution $S^0 = \Omega$ becomes good for energy $\hat{E}(S,P)$.
Thus, bias to equal size segments in standard likelihoods (1) is important
for histogram or GMM fitting methods [21, 16].

Many techniques with unbiased data term $\hat{E}(S,P)$ avoid trivial
solutions. Over-fitting is not a problem for simple models, e.g. Gaussians
[6], lines, homographies [18, 8]. Label cost could be used to limit model
complexity. Trivial solutions could also be removed by specialized regional
terms added to the energy [17]. Indirectly, optimization methods that stop
at a local minimum help as well.

Bound optimization for (9-10): One local optimization approach for
$\hat{E}(S,P)$ uses iterative minimization of weights $W$ for $E_W(S,P)$.
According to (8) the optimal weights at any current solution $S_t$ are
$W_t = \{\frac{|S_t^1|}{|\Omega|}, \ldots, \frac{|S_t^K|}{|\Omega|}\}$
since they minimize $KL(S_t|W)$. The algorithm iteratively optimizes
$E_{W_t}(S,P)$ over $P$, $S$ and resets to energy $E_{W_{t+1}}(S,P)$ at
each step until convergence. This block-coordinate descent can be seen as
bound optimization [13]. Indeed, see Figure 2, at any given $S_t$ energy
$E_{W_t}(S,P)$ is an upper bound for $\hat{E}(S,P)$, that is

$$\hat{E}(S,P) \le E_{W_t}(S,P) \quad \forall S$$
$$\hat{E}(S_t,P) = E_{W_t}(S_t,P).$$

This bound optimization approach to $\hat{E}(S,P)$ is a trivial
modification for any standard optimization algorithm for energies with
unary likelihood term $E_W(S,P)$ in (6).

High-order optimization for entropy in (9-10): Alternatively, optimization
of unbiased term $\hat{E}(S,P)$ could be based on equation (10). Since term
$E(S,P)$ is unary for $S$ the only issue is optimization of high-order
entropy $H(S)$. The entropy is a combination of terms $-z \log z$ for
$z = |S^k|/|\Omega|$. Each of these is a concave function of cardinality,
and such functions are known to be submodular [14]. As explained below,
entropy is amenable to efficient discrete optimization techniques both in
binary (Sec. 3.2) and multi-label cases (Sec. 3.2-3.3).

Optimization of concave cardinality functions was previously proposed in
vision for label consistency [11], bin consistency [17], and other
applications. Below, we discuss similar optimization methods in the context
of entropy. We use a polygonal approximation with triangle functions as
illustrated in Figure 3. Each triangle function is the minimum of two
affine cardinality functions, yielding an approximation of the type

$$-\frac{|S^k|}{|\Omega|} \log \frac{|S^k|}{|\Omega|} \approx \sum_{l} \min\big(a_l^L |S^k|, \; a_l^U |S^k| + b_l^U\big). \tag{11}$$

Optimization of each "triangle" term in this summation can be done as
follows. Cardinality functions like $a_l^L |S^k|$ and
$a_l^U |S^k| + b_l^U$ are unary. Evaluation of their minimum can be done
with an auxiliary variable $y_l \in \{0,1\}$ as in

$$\min_{y_l} \; y_l \big(a_l^L |S^k|\big) + \bar{y}_l \big(a_l^U |S^k| + b_l^U\big) \tag{12}$$

which is a pairwise energy. Indeed, consider binary segmentation problems
$S_p \in \{0,1\}$. Since

$$|S^k| = \begin{cases} \sum_{p \in \Omega} S_p, & \text{if } k = 1 \\ \sum_{p \in \Omega} (1 - S_p), & \text{if } k = 0 \end{cases} \tag{13}$$

(12) breaks into submodular³ pairwise terms for $y_l$ and $S_p$. Thus, each
"triangle" energy (12) can be globally optimized with graph cuts [12]. For
more general multi-label problems $S_p \in \mathbb{N}$ energy terms (12)
can be iteratively optimized via binary graph-cut moves like
$\alpha$-expansion [5]. Indeed, let variables $x_p \in \{0,1\}$ represent
$\alpha$-expansion from a current solution $S_t = \{S_t^k\}$ to a new
solution $S$. Since

$$|S^k| = \begin{cases} \sum_{p \in \Omega} x_p, & \text{if } k = \alpha \\ \sum_{p \in S_t^k} (1 - x_p), & \text{if } k \neq \alpha \end{cases} \tag{14}$$

(12) also reduces to submodular pairwise terms for $y_l$, $x_p$.

³ Depending on $k$, may need to switch $y_l$ and $\bar{y}_l$.

The presented high-order optimization approach makes stronger moves than
the simpler bound optimization method in the previous sub-section. However,
both methods use block coordinate descent iterating optimization of $S$ and
$P$ with no quality guarantees. The next section shows examples with
different optimization methods.

3. Examples

This section considers several representative examples of computer vision
problems where regularization energy uses likelihood term (1) with
re-estimated models $P^k$. We empirically demonstrate bias to segments of
the same size (2) and show advantages of different modifications of the
data term proposed in the previous section.

3.1. Segmentation with target volumes

In this section we consider a biomedical example with $K = 3$ segments:
$S^1$ background, $S^2$ liver, $S^3$ substructure inside liver (blood
vessels or cancer), see Fig. 4. The energy combines standard data term
$E(S,P)$ from (1), boundary length $\|\partial S\|$, an inclusion
constraint $S^3 \subset S^2$, and a penalty for $L_2$ distance between the
background segment and a given shape template $T$, as follows

$$E(S,P) + \lambda \|\partial S\| + [S^3 \subset S^2] + \beta \|S^1 - T\|_{L_2}. \tag{15}$$

For fixed models $P^k$ this energy can be globally minimized over $S$ as
described in [7]. In this example intensity likelihood models $P^k$ are
histograms treated as unknown parameters and estimated using
block-coordinate descent for variables $S$ and $P$. Figure 4 compares
optimization of (15) in (b) with optimization of a modified energy
replacing standard likelihoods $E(S,P)$ with a weighted data term in (6)

$$E_W(S,P) + \lambda \|\partial S\| + [S^3 \subset S^2] + \beta \|S^1 - T\|_{L_2} \tag{16}$$

for fixed weights $W$ set from specific target volumes (c-d).

The teaser in Figure 1(c) demonstrates a similar example for separating a
kidney from a liver based on Gaussian models $P^k$, as in Chan-Vese [6],
instead of histograms. Standard likelihoods $E(S,P)$ in (15) show
equal-size bias, which is corrected by weighted likelihoods $E_W(S,P)$ in
(16) with approximate target volumes $W = \{0.05, 0.95\}$.

3.2. Segmentation without volumetric bias 

We demonstrate in different applications a practically significant effect
of removing the volumetric bias, i.e., using our functional $\hat{E}(S,P)$.
We first report comprehensive comparisons of binary segmentations on the
GrabCut data set [16], which consists of 50 color images with ground-truth
segmentations and user-provided bounding boxes⁴. We compared three
energies: high-order energy $\hat{E}(S,P)$ (10), standard likelihoods
$E(S,P)$ (1), which was used in the well-known GrabCut algorithm [16], and
$E_W(S,P)$ (6), which constrains the solution with true target volumes
(i.e., those computed from ground truth). The appearance models in each
energy were based on histograms encoded by 16 bins per channel, and the
image data are colors specified in RGB coordinates. For each energy, we
added a standard contrast-sensitive regularization term [16, 2]:
$\lambda \sum_{\{p,q\} \in \mathcal{N}} \alpha_{pq} [S_p \neq S_q]$, where
$\alpha_{pq}$ denote standard pairwise weights determined by color contrast
and spatial distance between neighboring pixels $p$ and $q$ [16, 2].
$\mathcal{N}$ is the set of neighboring pixel pairs in an 8-connected grid.

[Figure 4 panels: (a) initial models, (b) segmentation for $W = U$, (c) for
$W = \{0.04, 0.96\}$, (d) for $W = \{0.75, 0.25\}$.]

Figure 4. Equal volumes bias $KL(S|U)$ versus target volumes bias
$KL(S|W)$. Grey histogram is a distribution of intensities for the ground
truth liver segment including normal liver tissue (the main mode), blood
vessels (the small mode on the right), and cancer tissue (the left mode).
(a) Initial (normalized) histograms for two liver parts. Initial
segmentation shows which histogram has larger value for each pixel's
intensity. (b) The result of optimizing energy (15). The solid blue and
green histograms at the bottom row are for intensities at the corresponding
segments. (c-d) The results of optimizing energy (16) for fixed weights $W$
set for specific target volumes.

We further evaluated two different optimization schemes for high-order
energy $\hat{E}(S,P)$: (i) bound optimization and (ii) high-order
optimization of concave cardinality potential $H(S)$ using polygonal
approximations; see Sec. 2 for details. Each energy is optimized by
alternating two iterative steps: (i) fixing the appearance histogram models
and optimizing the energy w.r.t. $S$ using graph cut [4]; and (ii) fixing
segmentation $S$ and updating the histograms from the current solution. For
all methods we used the same appearance model initialization based on a
user-provided box⁵.

⁴ http://research.microsoft.com/en-us/um/cambridge/projects/visionimagevideoediting/segmentation/grabcut.htm

⁵ The data set comes with two boxes enclosing the foreground segment for
each image. We used the outer bounding box to restrict the image domain and
the inner box to compute initial appearance models.


The error is evaluated as the percentage of mis-classified pixels with
respect to the ground truth. Table 1 reports the best average error over
$\lambda \in [1 \ldots 30]$ for each method. As expected, using the true
target volumes yields the lowest error. The second best performance was
obtained by $\hat{E}(S,P)$ with high-order optimization; removing the
volumetric bias substantially improves the performance of standard
log-likelihoods, reducing the error by 6%. The bound optimization obtains
only a small improvement as it is more likely to get stuck in weak local
minima. We further show representative examples for $\lambda = 16$ in the
last two rows of Table 1, which illustrate clearly the effect of both
equal-size bias in (1) and the corrections we proposed in (10) and (6).

It is worth noting that the error we obtained for standard likelihoods (the
last column in Table 1) is significantly higher than the 8% error
previously reported in the literature, e.g., [19]. The lower error in [19]
is based on a different (more recent) set of tighter bounding boxes [1 ],
where the size of the ground-truth segment is roughly half the size of the
box. Therefore, the equal-size bias in $E(S,P)$ (1) for this particular set
of boxes has an effect similar to the effect of true target volumes $W$ in
$E_W(S,P)$ (6) (the first column in Table 1), which significantly improves
the performance of standard likelihoods (the last column). In practice,
both 50/50 boxes and true $W$ are equally unrealistic assumptions that
require knowledge of the ground truth.

Table 1. Comparisons on the GrabCut data set.

Energy                    | $E_W(S,P)$ (6)        | $\hat{E}(S,P)$ (10)     | $\hat{E}(S,P)$ (9) | $E(S,P)$ (1)
                          | true target volumes W | high-order optimization | bound optimization | standard likelihoods
Overall error (50 images) | 5.29%                 | 7.87%                   | 13.41%             | 14.41%
Example 1 (error)         | 4.75%                 | 6.85%                   | 9.64%              | 14.69%
Example 2 (error)         | 2.29%                 | 4.95%                   | 41.20%             | 40.88%

Fig. 5 depicts a different application, where we segment a magnetic
resonance image (MRI) of the brain into multiple regions ($K > 2$). Here we
introduce an extension of $\hat{E}(S,P)$ using a positive factor $\gamma$
that weighs the contribution of entropy against the other terms:

$$E_\gamma(S,P) = E(S,P) + \gamma \, |\Omega| \, H(S). \tag{17}$$

This energy could be written as

$$\sum_{k=1}^{K} |S^k| \, KL(I^k|P^k) + |\Omega| \, H(S|I) + (\gamma - 1) \, |\Omega| \, H(S)$$

using the high-order decomposition of likelihoods $E(S,P)$ from [10]
presented in the intro. Thus, the bias introduced by $H(S)$ has two cases:
$\gamma < 1$ (volumetric equality bias) and $\gamma > 1$ (volumetric
disparity bias), as discussed below.

We used the Chan-Vese data term [6], which assumes the appearance models in
$E(S,P)$ are Gaussian distributions:
$-\log P^k(I_p) = (I_p - \mu^k)^2 / 2\sigma^2$, with $\mu^k$ the mean of
intensities within segment $S^k$ and $\sigma$ fixed for all segments. We
further added a standard total-variation term [20] that encourages boundary
smoothness.

The solution is sought following the bound optimization strategy we
discussed earlier; see Fig. 2. The algorithm alternates between two
iterative steps: (i) optimizing a bound of $E_\gamma(S,P)$ w.r.t.
segmentation $S$ via a continuous convex-relaxation technique [20] while
model parameters are fixed, and (ii) fixing segmentation $S$ and updating
parameters $\mu^k$ and $w^k$ using the current solution. We set the initial
number of models to 5 and fixed $\lambda = 0.1$ and $\sigma = 0.05$. We ran
the method for $\gamma = 0$, $\gamma = 1$ and $\gamma = 3$. Fig. 5 displays
the results using colors encoded by the region means obtained at
convergence. Column (a) demonstrates the equal-size bias for $\gamma = 0$;
notice that the yellow, red and brown components have approximately the
same size. Setting $\gamma = 1$ in (b) removed this bias, yielding much
larger discrepancies in size between these components. In (c) we show that
using large weight $\gamma$ in energy (17) has a sparsity effect; it
reduced the number of distinct segments/labels from 5 to 3. At the same
time, for $\gamma > 1$, this energy introduces disparity bias; notice the
gap between the volumes of orange and brown segments has increased compared
to $\gamma = 1$ in (b), where there was no volumetric bias. This disparity
bias is opposite to the equality bias for $\gamma < 1$ in (a).
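The alternation described above can be sketched in a few lines for 1D intensities. This is a simplified illustration, not the experimental setup: it omits the total-variation term and the convex relaxation, so step (i) reduces to independent per-pixel labeling, and it starts from uniform weights. The function names, initial means, and toy data are assumptions.

```python
import math

def volume_penalty(weight, gamma):
    """Unary term -gamma * log(w^k) that comes from the entropy bound."""
    if gamma == 0.0:
        return 0.0
    if weight <= 0.0:
        return math.inf  # an empty segment cannot be re-entered by this scheme
    return -gamma * math.log(weight)

def bound_optimization(pixels, init_means, sigma=0.05, gamma=1.0, max_iters=100):
    """Alternate (i) relabeling with fixed means/weights and (ii) re-estimation."""
    means = list(init_means)
    K = len(means)
    weights = [1.0 / K] * K          # assumed initialization
    labels = None
    for _ in range(max_iters):
        # (i) fix means and weights; give each pixel its cheapest label
        new_labels = []
        for x in pixels:
            costs = [(x - means[k]) ** 2 / (2.0 * sigma ** 2)
                     + volume_penalty(weights[k], gamma) for k in range(K)]
            new_labels.append(costs.index(min(costs)))
        if new_labels == labels:
            break                    # labeling is stable: local minimum reached
        labels = new_labels
        # (ii) fix labels; re-estimate means mu^k and weights w^k
        for k in range(K):
            members = [x for x, s in zip(pixels, labels) if s == k]
            weights[k] = len(members) / len(pixels)
            if members:
                means[k] = sum(members) / len(members)
    return labels, means, weights

# Toy data: a large dark group and a small bright group.
pixels = [0.1 + 0.001 * i for i in range(90)] + [0.9 + 0.001 * i for i in range(10)]
labels, means, weights = bound_optimization(pixels, [0.0, 1.0])
print(means, weights)
```

Setting `gamma=0.0` in this sketch recovers plain K-means style updates, while `gamma=1.0` corresponds to the unbiased data term (9-10). The infinite penalty for empty segments mirrors the limitation of simple alternation noted in Sec. 3.3.1.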

3.3. Geometric model fitting 

Energy minimization methods for geometric model fitting problems have
recently gained popularity due to [9]. Similarly to segmentation these
methods are often driven by a maximum likelihood based data term measuring
model fit to the particular feature. The theory presented in Section 2
applies to these problems as well and they therefore exhibit the same kind
of volumetric bias.

Figure 6. 3D-Geometry of the book scene in Figure 1 (b).

[Figure 5 panels, left to right: image, $\gamma < 1$ (equality bias),
$\gamma = 1$ (no bias), $\gamma > 1$ (disparity bias).]

Figure 5. Segmentation using energy (17) combined with a standard
total-variation regularization [20]. We used the Chan-Vese model [6] as
appearance term and bound optimization to compute a local minimum of the
energy (see Fig. 2). At each iteration, the bound is optimized w.r.t.
segmentation using the convex-relaxation technique in [20]. Initial number
of models: 5. $\lambda = 0.1$, $\sigma = 0.05$. Upper row (from left to
right): image data and the results for $\gamma = 0$, $\gamma = 1$ and
$\gamma = 3$. Lower row: histograms of the number of assignments to each
label and the entropies obtained at convergence.

Figure 1 (b) shows a simple homography estimation example. Here we captured
two images of a scene with two planes and tried to fit homographies to
these (the right image with results is shown in Figure 1). For this image
pair SIFT [15] generated 3376 matches on the larger plane (paper and floor)
and 135 matches on the smaller plane (book). For a pair of matching points
$I_p = \{x_p, y_p\}$ we use the log likelihood costs

$$\sum_{p \in S^k} -\log\big(w^k \cdot P_{H_k, \Sigma_k}(I_p)\big), \tag{18}$$

where
$P_{H_k, \Sigma_k}(I_p) = \frac{1}{(2\pi)^2 \sqrt{|\Sigma_k|}} e^{-\frac{1}{2} d_{H_k, \Sigma_k}(x_p, y_p)^2}$
and $d_{H_k, \Sigma_k}$ is the symmetric Mahalanobis transfer distance. The
solution to the left in Figure 1 (b) was generated by optimizing over
homographies and covariances while keeping the priors fixed and equal
($w^1 = w^2 = 0.5$). The volume bias makes the smaller plane (blue points)
grab points from the larger plane. For comparison Figure 1 (b) also shows
the result obtained when reestimating $w^1$ and $w^2$. Note that the two
algorithms were started with the same homographies and covariances. Figure
6 shows an independently computed 3D reconstruction using the same matches
as for the homography experiment.

3.3.1 Multi Model Fitting 

Recently discrete energy minimization formulations have been shown to be
effective for geometric model fitting tasks [9, 8]. These methods
effectively handle regularization terms needed to produce visually
appealing results. The typical objective functions are of the type

$$E(S, W, \Theta) = V(S) + D(S, W, \Theta) + L(S), \tag{19}$$

where $V(S) = \sum_{(p,q) \in \mathcal{N}} V_{pq}(S_p, S_q)$ is a
smoothness term and $L(S)$ is a label cost preventing over-fitting by
penalizing the number of labels. The data term

$$D(S, W, \Theta) = -\sum_{k} \sum_{S_p = k} \big(\log w^k + \log P(m_p | \Theta_k)\big) \tag{20}$$

consists of log-likelihoods for the observed measurements $m_p$, given the
model parameters $\Theta$. Typically the prior distributions $w^k$ are
ignored (which is equivalent to letting all $w^k$ be equal), hence
resulting in a bias to equal partitioning. Because of the smoothness and
label cost terms the bias is not as evident in practical model fitting
applications as in k-means, but as we shall see it is still present.
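For intuition, data term (20) can be evaluated for a toy line-fitting labeling as sketched below. The sketch assumes a Gaussian model on the point-to-line distance with known noise level and ignores the outlier label; the function names, line parametrization, and data are hypothetical.

```python
import math

def point_line_distance(point, line):
    """Distance from (x, y) to the line a*x + b*y + c = 0 with a^2 + b^2 = 1."""
    a, b, c = line
    return abs(a * point[0] + b * point[1] + c)

def data_term(points, labels, lines, weights, sigma=0.025):
    """D(S, W, Theta) in (20) with an assumed Gaussian residual model."""
    total = 0.0
    for point, k in zip(points, labels):
        d = point_line_distance(point, lines[k])
        log_lik = -0.5 * (d / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
        total -= math.log(weights[k]) + log_lik
    return total

# Three points on the line y = 0 and one point on the line x = 0.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (0.0, 1.0)]
labels = [0, 0, 0, 1]
lines = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]

uniform = data_term(points, labels, lines, [0.5, 0.5])    # priors ignored
matched = data_term(points, labels, lines, [0.75, 0.25])  # priors equal to V_S
print(uniform - matched)
```

For this unbalanced labeling the energy with equal priors is higher than the energy with priors matched to the actual volumes; the gap is the volumetric penalty that equal priors place on unequal partitions.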

Multi model fitting with variable priors presents an additional challenge.
The PEARL (Propose, Expand And Reestimate Labels) paradigm [9] naturally
introduces and removes models during optimization. However, when
reestimating priors, a model $k$ that is not in the current labeling will
have $w^k = 0$, giving an infinite log-likelihood penalty. Therefore a
simple alternating approach (see bound optimization in Sec. 2) will be
unable to add new models to the solution. For sets of small cardinality it
can further be seen that the entropy bound in Figure 2 will become
prohibitively large since the derivative of the entropy function is
unbounded (when approaching $w = 0$). Instead we use $\alpha$-expansion
moves with higher order interactions to handle the entropy term, as
described in Section 2.

Figure 7. Line fitting: (a) data generated from three lines, (b) data with
outliers, (c) fixed $W$ and $h = 100$, (d) fixed $W$ and $h = 200$, (e)
fixed $W$ and $h = 300$, (f) variable $W$ and $h = 5$.

Figure 7 shows the result of a synthetic line fitting experiment. Here we
randomly sampled points from four lines with different probabilities, added
noise with $\sigma = 0.025$ and added outliers. We used energy (19) without
smoothness and with label cost $h$ times the number of labels (excluding
the outlier label). The model parameters $\Theta$ consist of line location
and orientation. We treated the noise level for each line as known.
Although the volume bias seems to manifest itself more clearly when the
variance is reestimated, it is also present when only the means are
estimated.

Using random sampling we generated 200 line proposals to be used by both
methods (fixed and variable $W$). Figure 7 (c), (d) and (e) show the
results with fixed $W$ for three different strengths of label cost. Both
the label cost and the entropy term want to remove models with few assigned
points. However, the label cost does not favor any assignment when it is
not strong enough to remove a model. Therefore it cannot counter the volume
bias of the standard data term favoring more assignments to weaker models.
In the line fitting experiment of Figure 7 we varied the strength of the
label cost (three settings shown in (c), (d) and (e)) without being able to
correctly find all the 4 lines. Reestimation of $W$ in Figure 7 (f)
resulted in a better solution.

Figure 8. Homography fitting: fixed (left) and variable $W$ (right).

Figure 9. Histogram of the number of assignments to each label (model) in
Figure 8. Fixed $W$ (left) and variable $W$ (right).

Figures 8 and 9 show the results of a homography estimation problem with
the smoothness term $V(S)$. For the smoothness term we followed [9] and
created edges using a Delaunay triangulation with weights $e^{-d/5}$, where
$d$ is the distance between the points. For the label costs we used
$h = 100$ with fixed $W$ and $h = 5$ with variable $W$. We fixed the model
variance to $5^2$ (pixels$^2$).

The two solutions are displayed in Figure 8, and Figure 9 shows a histogram
of the number of assigned points to each model (black corresponds to the
outlier label). Even though smoothness and label costs mask it somewhat,
the bias to equal volume can be seen here as well.

4. Conclusions 

We demonstrated significant artifacts in standard segmentation and
reconstruction methods due to bias to equal size segments in standard
likelihoods (1) following from the general information theoretic analysis
[10]. We proposed binary and multi-label optimization methods that either
(a) remove this bias or (b) replace it by a KL divergence term for any
given target volume distribution. Our general ideas apply to many
continuous or discrete problem formulations.